1.1 Probability

Probability was already defined in the STAT 201A notes. Since this is a different course with (possibly) different notation, we derive it again here.

1 What is Probability?

Probability

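Mathematical probability is a function $P$ mapping some subsets of a sample space $\mathcal{X}$ to $[0,1]$, satisfying

  • $P(\mathcal{X})=1$,
  • $P(\emptyset)=0$,
  • (disjoint additivity) $P\left(\bigcup_{i=1}^{\infty}A_i\right)=\sum_{i=1}^{\infty}P(A_i)$ whenever $A_i\cap A_j=\emptyset$ for all $i\neq j$.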
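
As a minimal sketch of these three properties, consider a fair six-sided die with the uniform probability. The die, the name `prob`, and the chosen events are illustrative assumptions, not part of the original notes.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die as the sample space.
sample_space = frozenset(range(1, 7))

def prob(event):
    """Uniform probability: P(A) = |A| / |sample space|."""
    event = frozenset(event)
    if not event <= sample_space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

total = prob(sample_space)   # P(sample space), expected to be 1
empty = prob(frozenset())    # P(empty set), expected to be 0

# Two disjoint events: "even roll" and "roll a one".
evens, ones = {2, 4, 6}, {1}
additive = prob(evens | ones) == prob(evens) + prob(ones)
```

Exact fractions are used so that the additivity check is not affected by floating-point rounding.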

2 Measure and Integrals

2.1 Measure

Given a set $\mathcal{X}$, a measure $\mu$ is a certain kind of function mapping "nice enough" subsets $A\subseteq\mathcal{X}$ to non-negative numbers $\mu(A)\in[0,+\infty]$.

Generally, the domain of a measure is just a collection of "nice" subsets $\mathcal{F}\subseteq 2^{\mathcal{X}}$. $\mathcal{F}$ should satisfy certain closure properties, namely those of a $\sigma$-algebra.
The precise definition of a $\sigma$-algebra is not important here; see the standard definition.

Measure

Given a measurable space $(\mathcal{X},\mathcal{F})$, a measure $\mu$ is a map $\mu:\mathcal{F}\to[0,+\infty]$ with disjoint additivity and $\mu(\emptyset)=0$. $\mu$ is a probability measure if $\mu(\mathcal{X})=1$.
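
The definition can be checked by brute force on a small finite space. The sketch below assumes the standard closure properties of a $\sigma$-algebra (it contains the whole space and is closed under complements and countable unions, which reduce to pairwise unions on a finite space). The three-point space and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

space = frozenset({"a", "b", "c"})

def power_set(s):
    """All subsets of s, as a set of frozensets."""
    items = list(s)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def is_sigma_algebra(family, space):
    """Finite case: contains the space, closed under complement and pairwise union."""
    if space not in family:
        return False
    closed_complement = all(space - a in family for a in family)
    closed_union = all(a | b in family for a in family for b in family)
    return closed_complement and closed_union

def is_additive(mu, family):
    """Check mu(A ∪ B) = mu(A) + mu(B) for every disjoint pair in the family."""
    return all(mu(a | b) == mu(a) + mu(b)
               for a in family for b in family if not (a & b))

def counting_measure(a):
    """Counting measure: the number of points in a."""
    return len(a)

def uniform_probability(a):
    """Counting measure normalized so that the whole space has mass 1."""
    return Fraction(len(a), len(space))

sigma_algebra = power_set(space)
```

Here `counting_measure` is a measure but not a probability measure, since it assigns mass 3 to the whole space; dividing by that total mass gives `uniform_probability`.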

2.2 Integral

Integral

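An integral w.r.t. $\mu$ puts weight $\mu(A)$ on $A$. For an indicator function, define $\int 1\{x\in A\}\,d\mu(x)=\mu(A)$. We can extend to other functions by linearity, $\int c\,1_A(x)\,d\mu(x)=c\,\mu(A)$ and hence $\int\sum_{i=1}^{n}c_i 1_{A_i}(x)\,d\mu(x)=\sum_{i=1}^{n}c_i\,\mu(A_i)$,
and by limits, $\int f\,d\mu=\lim_{n\to\infty}\int f_n(x)\,d\mu(x)$, where the $f_n$ are simple functions of the form above that approximate $f$.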
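
The linearity step can be illustrated on a finite space, where a measure is determined by its point weights. This is only a sketch: the weights, the sets, and the helper names are made up for illustration.

```python
# Hypothetical measure on a three-point space, given by point weights.
weights = {"a": 0.5, "b": 1.5, "c": 2.0}

def mu(A):
    """Measure of a set: the total weight of its points."""
    return sum(weights[x] for x in A)

def integrate_simple(terms):
    """Integral of sum_i c_i * 1_{A_i} d(mu), computed as sum_i c_i * mu(A_i)."""
    return sum(c * mu(A) for c, A in terms)

def integrate_pointwise(f):
    """Integral of f on a finite space: sum over x of f(x) * mu({x})."""
    return sum(f(x) * w for x, w in weights.items())

# Simple function f = 2 * 1_{a,b} - 1 * 1_{c}.
terms = [(2.0, {"a", "b"}), (-1.0, {"c"})]

def f(x):
    """Evaluate the simple function at a point."""
    return sum(c for c, A in terms if x in A)

via_linearity = integrate_simple(terms)
via_points = integrate_pointwise(f)
```

Both routes compute the same integral, which is what the linearity rule asserts for simple functions.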

3 Densities

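A measure $P$ is absolutely continuous w.r.t. $\mu$ if $P(A)=0$ whenever $\mu(A)=0$. We write $P\ll\mu$, or say that $\mu$ dominates $P$.
If $P\ll\mu$, then (under mild conditions, e.g. $\mu$ is $\sigma$-finite) there is a density function $p:\mathcal{X}\to[0,+\infty)$ such that $P(A)=\int 1_A(x)\,p(x)\,d\mu(x)$ for all $A\in\mathcal{F}$, and by extension $\int f(x)\,dP(x)=\int f(x)\,p(x)\,d\mu(x)$. The density function $p$ is also called the Radon-Nikodym derivative of $P$ w.r.t. $\mu$, and is sometimes written as $\frac{dP}{d\mu}(x)$.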

If $\mu$ is not specified, it is taken to be the Lebesgue measure $\lambda$. That is, saying that $P$ is absolutely continuous means $P\ll\lambda$ by default.

If $P$ is a probability measure and $\mu$ is the Lebesgue measure, then $p$ is called a probability density function (p.d.f.);
if instead $\mu$ is the counting measure, $p$ is called a probability mass function (p.m.f.).
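
The two cases differ only in the dominating measure used in $P(A)=\int 1_A(x)\,p(x)\,d\mu(x)$. A rough numerical sketch, assuming a Binomial(3, 0.5) p.m.f. and an Exponential(1) p.d.f. as examples (neither appears in the original notes), with a midpoint Riemann sum standing in for the Lebesgue integral:

```python
import math

def binom_pmf(x, n=3, theta=0.5):
    """Binomial(n, theta) density w.r.t. counting measure on {0, ..., n}."""
    return math.comb(n, x) * theta**x * (1 - theta) ** (n - x)

def prob_counting(A, p):
    """P(A) w.r.t. counting measure: the integral is a sum over the points of A."""
    return sum(p(x) for x in A)

def exp_pdf(x):
    """Exponential(1) density w.r.t. Lebesgue measure."""
    return math.exp(-x) if x >= 0 else 0.0

def prob_lebesgue(a, b, p, n=100_000):
    """P([a, b]) w.r.t. Lebesgue measure, approximated by a midpoint Riemann sum."""
    h = (b - a) / n
    return h * sum(p(a + (i + 0.5) * h) for i in range(n))

p_discrete = prob_counting({2, 3}, binom_pmf)      # P(X >= 2) for the binomial
p_continuous = prob_lebesgue(0.0, 1.0, exp_pdf)    # P(0 <= X <= 1) for the exponential
```

For the exponential case the result can be compared with the closed form $1-e^{-1}$.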

4 Probability Spaces, Random Variables

Denote the outcome space by $\Omega$. To evaluate probabilities involving one or more random variables, it is convenient to start from an abstract outcome $\omega\in\Omega$.

Probability Space

Let $(\Omega,\mathcal{F},P)$ be a probability space.

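  • $\omega\in\Omega$ is called an outcome.
  • $A\subseteq\Omega$ (with $A\in\mathcal{F}$) is called an event.
  • $P(A)$ is called the probability of the event $A$.
  • A function $X:\Omega\to\mathcal{X}$ is called a random variable $X(\omega)$.
  • We say $X$ has distribution $Q$ (written $X\sim Q$) if $P(X\in B)=Q(B)$ for every measurable $B\subseteq\mathcal{X}$.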

$Q$ is the push-forward of $P$ through the function $X(\omega)$: $Q(B)=P\circ X^{-1}(B)=P(\{\omega: X(\omega)\in B\})$.

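More generally, if $\mu$ is a measure on $\mathcal{X}$, a (measurable) function $f:\mathcal{X}\to\mathcal{Y}$ induces a new measure $\nu$ on $\mathcal{Y}$ via $\nu(B)=\mu\circ f^{-1}(B)$, where $B\subseteq\mathcal{Y}$ and $f^{-1}(B)\subseteq\mathcal{X}$. ($f^{-1}$ denotes the preimage.)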
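
A small sketch of the push-forward, assuming two fair dice as the outcome space and their sum as the random variable. The example and the names `P`, `X`, `preimage`, and `Q` are illustrative choices, not taken from the notes.

```python
from fractions import Fraction
from itertools import product

# Outcome space: all ordered pairs of rolls of two fair dice.
omega = list(product(range(1, 7), repeat=2))

def P(event):
    """Uniform probability on the outcome space; event is a list of distinct outcomes."""
    return Fraction(len(event), len(omega))

def X(w):
    """Random variable: the sum of the two rolls."""
    return w[0] + w[1]

def preimage(f, B):
    """f^{-1}(B): all outcomes that f maps into B."""
    return [w for w in omega if f(w) in B]

def Q(B):
    """Push-forward of P through X: Q(B) = P(X^{-1}(B))."""
    return P(preimage(X, B))

q_seven = Q({7})                 # probability that the sum equals 7
q_all = Q(set(range(2, 13)))     # probability of the whole range of X
```

Since every outcome maps into $\{2,\dots,12\}$, the event $\{X\in\{2,\dots,12\}\}$ has probability one, so it happens almost surely in the sense defined next.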

If P(A)=1, we say A happens almost surely.